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Zika  virus  (ZIKV)  is  currently  causing  an  unprecedented  pandemic  linked  to  severe  congenital 
syndromes1'2.  In  July  2016,  mosquito-borne  ZIKV  transmission  was  first  reported  in  the  continental 
United  States  and  since  then,  hundreds  of  locally  acquired  infections  have  been  described3.  To  gain 
insights  into  the  timing,  source,  and  likely  route(s)  of  introduction  into  the  United  States,  we 
tracked  the  virus  from  its  first  detection  in  Miami,  Florida  by  direct  sequencing  of  ZIKV  genomes 
from  infected  patients  and  Aedes  aegypti  mosquitoes.  We  detected  at  least  four  distinct  ZIKV 
introductions  and  estimate  that  11-52  introductions  contributed  to  the  outbreak  in  Florida. 
Furthermore,  our  data  suggests  that  ZIKV  transmission  likely  started  in  the  spring  of  2016  - 
several  months  before  initial  detection.  By  analyzing  epidemiological,  surveillance,  and  genetic 
data,  we  discovered  that  several  spatially  distinct  ZIKV  transmission  zones  were  likely  portions  of 
the  same  outbreak,  rather  than  isolated  events.  Our  analyses  show  that  most  introductions  are 
linked  to  the  Caribbean,  which  is  supported  by  the  high  incidence  rates  and  traffic,  especially  via 
cruises,  from  the  region  into  Miami.  By  comparing  mosquito  abundance  and  travel  capacity  across 
the  United  States,  we  find  that  southern  Florida  is  especially  vulnerable  to  ZIKV  introductions  and 
at  risk  of  repeat  occurrences.  By  tracking  the  virus  from  its  initial  introduction  into  the  United 
States,  we  provide  a  deeper  understanding  of  how  ZIKV  initiates  and  sustains  transmission  in  new 
regions. 

Local  ZIKV  transmission  was  first  reported  in  Brazil  in  May  20 154,  though  it  was  likely  introduced  to  the 
Americas  1-2  years  prior  to  its  detection5.  By  January  2016,  the  ZIKV  epidemic  had  spread  to  several 
South  and  Central  American  countries  and  most  Caribbean  islands6.  Like  dengue  virus  (DENV)  and 
chikungunya  virus  (CHIKV),  ZIKV  is  vectored  primarily  by  Aedes  mosquitoes7-10.  The  establishment  of 
the  peridomestic  mosquito  Ae.  aegypti  in  the  Americas11  has  facilitated  DENV,  CHIKV,  and  now  likely 
ZIKV12  to  become  endemic  in  this  region.  In  the  United  States,  transient  outbreaks  of  DENV  and  CHIKV 
have  been  reported  in  regions  of  Texas  and  Florida13-17  with  abundant  seasonal  Ae.  aegypti 
populations11'18. 

In  July  2016,  the  first  locally-acquired  ZIKV  cases  in  the  continental  United  States  were  detected  in 
Miami,  Florida3.  The  outbreak,  which  generated  256  confirmed  locally  acquired  ZIKV  infections  in  2016, 
was  likely  initiated  by  one  or  more  of  the  many  travel-associated  infections  in  Florida  (Fig.  la).  While 
transmission  was  confirmed  across  four  counties  in  Florida  (Fig.  lb),  the  ZIKV  outbreak  was  the  most 
intense  in  Miami-Dade  County  (244  infections).  Although  the  location  of  exposure  could  not  be 
determined  in  all  cases,  at  least  114  (45%)  of  the  infections  were  likely  acquired  in  one  of  three  distinct 
transmission  zones:  Wynwood,  Miami  Beach,  and  Little  River  (Fig.  lc-d). 

We  obtained  mosquito  surveillance  data  from  Miami-Dade  County’s  Mosquito  Control  Division  (Fig.  lc- 
d  and  Extended  Data  Fig.  1).  Of  24,351  mosquitoes  (sorted  into  2,596  pools)  collected  throughout  Miami- 
Dade  County  from  June  to  November,  2016,  99.8%  were  Ae.  aegypti.  Eight  of  these  pools  tested  positive 
for  ZIKV  RNA,  all  of  which  were  Ae.  aegypti  collected  in  Miami  Beach  (Fig.  lc).  From  these  pools,  we 
estimated  that  ~1  out  of  1,600  Ae.  aegypti  mosquitoes  were  infected  (0.061%,  95%  Cl:  0.028-0.115%, 
Extended  Data  Fig.  la),  which  is  similar  to  some  reported  Ae.  aegypti  infection  rates  during  DENV  and 
CHIKV  outbreaks19.  Even  though  ZIKV -infected  mosquitoes  were  not  detected  in  transmission  zones 
outside  Miami  Beach  (Fig.  lc),  we  found  that  the  number  of  human  ZIKV  cases  correlated  strongly  with 
Ae.  aegypti  abundance  within  each  of  the  defined  transmission  zones  (Spearman  r  =  0.61,  Fig.  Id, 
Extended  Data  Fig.  le).  This  suggests  that  the  primary  mode  of  transmission  throughout  Miami  was  via 
Ae.  aegypti  mosquitoes  and  that  mosquito  abundance  changes  directly  impacted  human  infection  rates. 
Therefore,  the  application  of  insecticides3,  which  we  found  to  suppress  local  mosquito  populations  during 
the  periods  of  intensive  usage  (Extended  Data  Fig.  lc),  likely  contributed  to  the  clearance  of  ZIKV  within 
the  Miami  area. 

To  gain  molecular  insights  into  the  Florida  outbreak,  we  sequenced  39  ZIKV  genomes  directly  from 
clinical  and  mosquito  samples  without  cell  culture  enrichment2"  (Supplementary  Table  1).  Our  Florida 
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ZIKV  dataset  (36  genomes)  included  29  genomes  from  patients  with  locally-acquired  infections  (Fig.  Id) 
and  7  from  Ae.  ciegypti  mosquito  pools  (Fig.  lc).  We  also  sequenced  3  ZIKV  genomes  from  travel- 
associated  cases  diagnosed  in  Florida.  Our  dataset  included  cases  associated  with  all  three  transmission 
zones  in  Miami  (Fig.  Id)  and  represented  -11%  of  all  confirmed  locally-acquired  cases  in  Florida.  We 
made  all  sequence  data  openly  available  at  NCBI  (BioProjects  PRJNA342539,  PRJNA356429)  and  other 
online  resources  shortly  after  data  generation. 

We  reconstructed  phylogenetic  trees  from  our  39  ZIKV  genomes  along  with  65  published  genomes  from 
other  affected  regions  to  infer  viral  genealogies  (Fig.  2,  Extended  Data  Fig.  2  and  3).  We  found  that  the 
Florida  ZIKV  genomes  formed  four  distinct  lineages  (FI  -  F4.  Fig.  2a).  three  of  which  (FI  -  F3)  belong  to 
the  same  major  clade  (clade  A,  Fig.  2a).  We  only  sampled  a  single  human  case  each  from  the  F3  and  F4 
lineages,  consistent  with  limited  transmission  (Fig.  2a).  The  other  two  Florida  lineages  (FI  and  F2)  were 
derived  from  several  human  and  mosquito  samples  from  Miami -Dade  County  (Fig.  2b). 

We  estimated  the  number  of  introductions  responsible  for  the  locally  acquired  cases  observed  in  our 
dataset  using  time-structured  phylogenies21.  Lineage  F4  resulted  from  an  independent  introduction  based 
on  phylogenetic  placement  within  a  distinct  major  clade  (Fig.  2a).  We  estimated  the  time  of  the  most 
recent  common  ancestor  (tMRCA)  to  be  in  the  summer  of  2015  for  the  two  well-supported  nodes  (A  and 
B)  linking  lineages  FI,  F2,  and  F3  (Fig.  2a).  The  tMRCA  estimates  were  robust  across  a  range  of 
different  molecular  clock  and  coalescent  models  (Extended  Data  Table  1).  This  suggests  that  while  F1-F3 
belong  to  the  same  clade,  any  less  than  three  distinct  introductions  leading  to  these  lineages  would  likely 
have  required  undetected  transmission  of  ZIKV  in  Florida  for  approximately  one  year  (Fig.  2a). 

To  estimate  the  likelihood  of  a  single  transmission  chain  persisting  for  over  a  year,  we  modeled  epidemic 
spread  under  different  assumptions  of  the  basic  reproductive  number  (R0).  We  found  that  R{)  of 
approximately  0.6  was  most  consistent  with  locally  acquired  and  travel  associated  case  counts  in  Miami- 
Dade  County  (Extended  Data  Fig.  4).  With  RQ  of  0.6,  we  determined  that  the  probability  of  a  single 
transmission  chain  persisting  for  over  a  year  in  Florida  is  extremely  low  (<  0.0001,  Fig.  2c).  This  is 
especially  true  considering  that  Ae.  oegypti  abundance  is  low  in  Florida  during  the  winter  months 
(Extended  Data  Fig.  If).  Given  the  low  probability  of  long  term  persistence,  we  expect  that  each  sampled 
ZIKV  lineage  in  Florida  (F1-F4)  was  likely  the  result  of  a  separate  introduction.  Differences  in 
surveillance  practises  across  areas,  combined  with  the  high  number  of  travel-associated  cases  in  Florida 
(Fig.  la),  likely  mean  that  unsampled  ZIKV  introductions  have  also  contributed  to  the  size  of  the 
outbreak.  Therefore,  to  estimate  the  total  number  of  underlying  ZIKV  introductions,  we  modeled 
epidemic  dynamics  resulting  in  244  locally  acquired  cases  within  Miami-Dade  County,  and  found  that 
with  R0  of  0.6  we  expect  31  (95%  Cl  11-52)  separate  introductions  to  have  driven  the  Miami-Dade 
outbreak  (Fig.  2d).  The  majority  of  these  introductions  would  likely  have  generated  a  single  secondary 
case  that  was  undetected  in  the  genetic  sampling  (Extended  Data  Fig.  4a). 

The  two  main  ZIKV  lineages,  FI  and  F2,  included  the  vast  majority  of  genomes  we  sampled  in  Florida 
(92%,  Fig.  2a).  We  investigated  the  timing  of  introduction  of  these  lineages,  and  found  that  the 
probability  densities  for  the  tMRCAs  of  both  FI  and  F2  were  centered  around  March-April,  2016  (Fig. 
2b,  Extended  Data  Fig.  3,  95%  Bayesian  credible  intervals  (BCI):  January-May,  2016).  This  finding 
suggests  that  ZIKV  transmission  in  Florida  started  at  least  two  months  prior  to  its  first  detection  in  July 
2016  (Fig.  la). 

To  investigate  if  the  ZIKV  transmission  zones  within  Miami  were  connected,  we  analyzed  our  genomic 
data  together  with  case  investigation  data  from  the  Florida  Department  of  Health  (DOH;  Supplementary 
Table  1).  While  spatially  distinct,  the  ZIKV  transmission  zones  occurred  within  '3  miles  of  each  other 
(Fig.  lc)  and  we  found  that  the  ZIKV  infections  associated  with  each  zone  overlapped  temporally  (Fig. 
Id).  Our  22  ZIKV  genomes  with  zone  assignments  all  belonged  to  lineages  FI  and  F2,  but  neither  of 
these  lineages  were  confined  to  a  single  transmission  zone  (Fig.  2b).  In  fact,  we  detected  both  FI  and  F2 
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lineage  viruses  from  Ae.  aegypti  collected  from  the  same  trap  26  days  apart  (mosquitoes  5  and  8,  Fig.  2b). 
The  temporal  overlap  of  ZIKV  cases  and  the  detection  of  genetically  similar  viruses  across  zones  suggest 
that  the  ZIKV  transmission  zones  within  Miami  were  part  of  the  same  outbreak,  rather  than  independent 
events. 

ZIKV  transmission  is  not  likely  to  persist  through  the  winter  in  Florida22  (Fig.  2c,  Extended  Data  Fig.  If); 
therefore,  determining  the  sources  and  routes  of  ZIKV  introductions  could  help  to  avert  future  outbreaks. 
We  found  that  lineages  F1-F3  clustered  with  ZIKV  genomes  sequenced  from  the  Dominican  Republic 
and  Guadeloupe  (Extended  Data  Figs  2  and  3),  suggesting  a  Caribbean  source  for  these  viruses  (Fig.  2). 
In  contrast,  the  F4  lineage  clustered  with  genomes  from  Central  America  (Extended  Data  Figs  2  and  3). 
Overall,  our  findings  suggest  that  while  ZIKV  outbreaks  occurred  throughout  the  Americas,  the 
Caribbean  islands  appear  to  have  been  the  main  source  of  ZIKV  that  resulted  in  local  transmission  in 
Florida.  Because  of  the  severe  undersampling  of  ZIKV  genomes  from  across  the  Americas,  however,  we 
cannot  rule  out  other  source  areas  or  panmixis  of  the  ZIKV  population  in  the  region.  Similarly,  even 
though  we  found  that  the  Florida  ZIKV  genomes  clustered  together  with  sequences  from  the  Dominican 
Republic,  our  results  should  not  be  interpreted  to  indicate  that  ZIKV  entered  Florida  from  this  country. 

We  investigated  ZIKV  infection  rates  and  travel  patterns  to  corroborate  our  phylogenetic  evidence  for 
Caribbean  introductions  into  Florida.  We  found  that  the  Caribbean  islands  bore  the  highest  ZIKV 
incidence  rates  (Fig.  3a,  Extended  Data  Fig.  5,  Supplementary  Table  2),  despite  Brazil  and  Colombia 
reporting  the  highest  absolute  number  of  ZIKV  cases  around  the  likely  time  of  most  introductions  into 
Florida  (January- June,  2016).  Over  the  same  time  period,  travelers  from  the  Caribbean  accounted  for  56% 
(~3  million)  of  the  total  traffic  into  Miami,  with  the  vast  majority  (-2.4  million)  arriving  via  cruise  ships 
(Fig.  3b,  Extended  Data  Fig.  6,  Supplementary  Table  2).  Combining  the  infection  rates  with  travel 
capacities,  we  estimated  that  -60-70%  of  ZIKV  infected  travelers  entering  Miami  arrived  from  the 
Caribbean  (Fig.  3c  and  Extended  Data  Fig.  7a).  In  support  of  our  estimates,  we  found  that  the  observed 
number  of  travel-associated  ZIKV  cases  correlated  strongly  with  the  expected  number  of  importations 
from  the  Caribbean  (Spearman  r  =  0.8,  Fig.  3d,  Extended  Data  Fig.  7b)  and  45%  of  the  people  infected 
abroad  reported  recent  travel  to  the  Caribbean  (Fig.  3e).  Notably,  23%  of  the  infected  travelers  diagnosed 
in  Miami  visited  Nicaragua  (Central  America).  Because  there  are  no  ZIKV  genomes  sampled  from  this 
country,  we  cannot  determine  the  relationship  between  the  outbreaks  in  Nicaragua  and  Florida.  Taken 
together,  these  findings  suggest  that  a  high  incidence  of  ZIKV  in  the  Caribbean,  combined  with  frequent 
cruise  ship  travel  from  the  region,  could  have  played  key  roles  in  the  establishment  of  ZIKV  transmission 
in  Florida  in  the  spring  of  2016.  These  findings,  however,  should  not  be  interpreted  to  indicate  that  cruise 
ships  themselves  are  risk  factors  for  human  ZIKV  infection,  but  only  that  they  serve  as  a  major  mode  of 
transportation  from  areas  with  active  transmission. 

The  vast  majority  of  the  ZIKV  outbreak  in  the  United  States  occurred  in  Miami-Dade  County  (95%  of 
reported  local  cases,  Fig.  lb).  To  investigate  if  ZIKV  transmission  was  more  likely  to  become  established 
in  Miami  than  in  other  parts  of  Florida,  we  analyzed  Ae.  aegypti  abundance  and  incoming  passenger 
traffic  across  all  Florida  ports.  During  January  to  April  2016  when  the  earliest  ZIKV  introductions  likely 
occurred  (Fig.  2b),  we  estimated  that  mosquito  abundance  was  higher  near  Miami  compared  to  other 
regions  of  Florida  (Fig.  3f,  Extended  Data  Fig.  If);  however,  by  June,  all  Florida  counties  were  predicted 
to  support  relatively  high  Ac.  aegypti  populations  (Fig.  3f).  In  addition,  we  found  that  ports  in  Miami  and 
nearby  Fort  Fauderdale  received  -72%  of  the  total  flight  and  cruise  traffic  entering  Florida  (Fig.  3f, 
Extended  Data  8a).  Therefore,  we  expect  a  higher  likelihood  of  ZIKV  introductions  and  transmission  in 
the  greater  Miami  area,  due  to  the  high  volume  of  incoming  traffic,  dense  human  population,  and  high 
early-season  Ae.  aegypti  abundance.  These  findings  could  also  explain  why  United  States-based  outbreaks 
of  DENV  and  CHIKV  have  mostly  occurred  in  Miami  and  surrounding  areas16'17. 

We  analyzed  travel  data  from  passengers  arriving  from  ZIKV-afflicted  regions  and  seasonal  Ae.  aegypti 
abundances lx  to  determine  if  other  regions  in  the  continental  United  States  were  at  risk  for  ZIKV 
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outbreaks.  We  found  that  Florida  received  8  times  more  cruise  ship  (Extended  Data  Fig.  8b)  and  1.5  times 
more  flight  passengers23  per  month  than  any  other  state.  In  addition,  southern  Florida  supports  high  Ae. 
aegypti  populations  most  of  the  year1118  (Extended  Data  Fig.  8c).  This  helps  explain  why  Miami  is  one  of 
the  regions  in  the  United  States  most  at  risk  for  Ae.  aegypti- borne  viruses,  including  ZIKV,  DENV,  and 
CFIIKV24.  Outside  of  Florida,  major  port  cities  such  as  Houston,  New  York,  and  New  Orleans  received  a 
high  proportion  of  international  traffic23  (Extended  Data  Fig.  8b),  support  seasonal  Ae.  aegypti 
populations18  (Extended  Data  Fig.  8c),  and  therefore  may  be  at  risk  of  transient  ZIKV  outbreaks.  In  the 
last  20  years,  however,  the  only  area  in  the  continental  United  States  outside  of  Florida  to  report  Ae. 
aegypti- borne  virus  transmission  is  along  the  United  States-Mexico  border  near  Brownsville,  TX25.  This 
area  also  reported  local  ZIKV  transmission  in  201626,  which,  as  suggested  for  the  2005  DENV  outbreak17, 
may  have  been  initiated  by  land  travel  from  Mexico23.  Therefore,  Caribbean  cruises  may  only  be  an 
important  mode  of  ZIKV  introductions  in  Florida  (Fig.  3,  Extended  Data  Fig.  8c). 

The  extent  of  ZIKV  transmission  in  Florida  was  unprecedented.  There  were  more  reported  ZIKV  cases  in 
2016  (256)  than  all  DENV  cases  observed  since  2009  (135)13.  Given  that  the  majority  of  ZIKV  infections 
are  asymptomatic2  27,  these  case  numbers  are  likely  underestimated.  Despite  this,  we  predict  that  R0  was 
less  than  1  and  multiple  introductions  were  probably  necessary  to  fuel  an  outbreak  of  this  magnitude22. 
Human  air  travel  is  often  implicated  in  the  spread  of  mosquito-borne  viruses28,  but  here  we  suggest  that 
cruise  ships  traveling  from  Caribbean  islands  with  intense  ZIKV  transmission  may  have  provided  a 
substantial  supply  of  the  ZIKV  infections  that  contributed  to  the  Florida  outbreak.  Our  virus  genetic  data 
also  link  three  of  the  four  observed  ZIKV  introductions  to  the  Caribbean  islands;  though  because  of 
undersampling  in  the  region,  we  cannot  pinpoint  exact  sources.  ZIKV  is  likely  to  become  endemic  to  the 
Americas12,  and  because  Miami  supports  relatively  large  Ae.  aegypti  populations11'18,  is  densely 
populated,  and  is  a  hub  for  international  travel23,  we  expect  that  future  ZIKV  introductions  into  Florida 
will  occur.  However,  the  magnitude  of  future  outbreaks  is  likely  to  be  limited  in  the  absence  of  repeated 
introduction  events. 
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Methods 

Ethical  statement 

This  study  has  been  evaluated  and  approved  by  Institutional  Review  Boards  (IRB)  at  The  Scripps 
Research  Institute  (TSRI).  The  study  was  reviewed  by  the  Florida  DOH  Human  Research  Protection 
Program  and  the  US  Army  Medical  Research  Institute  of  Infectious  Diseases  (USAMRIID)  Office  of 
Human  Use  and  Ethics  and  determined  not  to  require  IRB  review. 

Florida  Zika  virus  case  data 

Weekly  reports  of  international  travel-associated  Zika  fever  cases  in  Florida  and  ZIKV  infected  cases 
acquired  in  Florida  were  obtained  from  the  Florida  DOH  mosquito-borne  disease  surveillance  system13. 
Dates  of  symptom  onset  from  the  Miami  transmission  zones  (i.e.  Wynwood,  Miami  Beach,  and  Little 
River)  determined  by  the  Florida  DOH  investigation  process  were  obtained  from  the  ZIKV  resource 
website29  and  daily  updates30.  International  travel-associated  ZIKV  case  counts  in  the  United  States 
(outside  of  Florida)  were  obtained  from  The  Centers  for  Disease  Control  and  Prevention  (CDC)31;  due  to 
some  reporting  discrepancies,  however,  the  travel-associated  case  numbers  for  Florida  were  obtained 
from  the  Florida  DOH  and  not  the  CDC. 

Clinical  sample  collection 

Clinical  samples  from  locally-acquired  ZIKV  infections  were  collected  from  June  22  to  October  1 1,  2016. 
The  Florida  DOH  identified  persons  with  compatible  illness  and  clinical  samples  were  shipped  to  the 
Bureau  of  Public  Health  Laboratories  for  confirmation  by  qRT-PCR  and  antibody  tests  following  interim 
guidelines3'32  34.  Clinical  specimens  (whole  blood,  serum,  saliva,  or  urine)  submitted  for  analysis 
were  refrigerated  or  frozen  at  ^-70°C  until  RNA  was  extracted.  RNA  was 
extracted  using  the  RNAeasy  kit  (QIAGEN),  MagMAX  for  Microarrays  Total  RNA  Isolation  Kit 
(Ambion),  or  MagNA  Pure  LC  2.0  or  96  Systems  (Roche  Diagnostics).  Purified  RNA  was  eluted  into  50- 
100  pL  of  a  low-salt  buffer,  immediately  frozen  at  <  -70°C,  and  transported  on  dry 
ice.  The  Florida  DOH  also  provided  investigation  data  for  these  samples,  including  symptoms  onset 
dates  and,  when  available,  assignments  to  the  zone  where  transmission  likely  occurred  (Supplementary 
Table  1). 

Mosquito  collection  and  entomological  data  analysis 

24,351  Ae.  aegypti  and  Ae.  albopictus  mosquitoes  (sorted  into  2,596  pools)  were  collected  throughout 
Miami-Dade  County  during  June  to  November,  2016  using  BG-Sentinel  mosquito  traps  (Biogents  AG). 
Up  to  50  mosquitoes  of  the  same  species  and  sex  were  pooled  per  trap.  The  pooled  mosquitoes  were 
stored  in  RNAlater  (Invitrogen),  RNA  was  extracted  using  either  the  RNAeasy  kit  (QIAGEN)  or 
MagMAX  for  Microarrays  Total  RNA  Isolation  Kit  (Ambion),  and  ZIKV  RNA  was  detected  by  qRT- 
PCR  targeting  the  envelope  protein  coding  region34  or  the  Trioplex  qRT-PCR  kit35.  ZIKV  infection  rates 
were  calculated  per  1,000  female  Ae.  aegypti  mosquitoes  using  the  bias-corrected  maximum  likelihood 
estimate  (MLE)36.  Vector  index  was  calculated  by  multiplying  the  Ae.  aegypti  abundance  per  trap  night 
with  the  estimated  proportion  of  infected  mosquitoes37.  MLE  infection  rate  and  vector  index  could  only 
be  calculated  from  Miami  Beach  because  it  was  the  only  region  with  mosquitoes  confirmed  to  contain 
ZIKV  RNA.  Ae.  aegypti  abundance,  MLE  infection  rate,  and  vector  were  calculated  using  a  sliding 
window  of  three  days  with  a  step  of  one  day  (i.e.  mean  and  95%  Cl  of  day  before  and  after)  to  smooth 
some  of  the  natural  variability  of  daily  entomological  measures.  Days  of  insecticide  usage  by  the  Miami- 
Dade  Mosquito  Control  were  inferred  from  the  zone-specific  ZIKV  activities  timelines  published  by  the 
Florida  DOH29.  Daily  minimum  temperatures  for  Miami  were  acquired  from  NOAA’s  National  Centers 
for  Environmental  Information38. 
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Relative  monthly  Ae.  aegypti  abundance 

For  the  purpose  of  this  study  we  used  previous  modelling  outputs  from  Kraemer  et  al."  reapplied  the 
statistical  relationships  between  mosquito  presence  and  its  environmental  correlated  to  monthly  covariate 
data  to  generate  seasonally  varying  maps  of  probability  of  occurrence  at  5  km  x  5  km  global  grids39. 

Following  Flwang  et  al.40  we  use  a  simple  mathematical  formula  to  transform  the  probability  of  detection 
maps  into  mosquito  abundance  maps.  In  order  to  do  so,  assume  P  (Y=l)  where  Y  is  a  binary  variable 
(presence/absence).  Using  a  Poisson  distribution  X()  to  govern  the  abundance  of  mosquitoes,  the 
probability  of  not  observing  any  mosquitoes  can  be  related  to  the  probability  of  absence  as: 
P(X=0)=P(Y=0).  Now  we  use  the  following  transformation  to  generate  abundance  estimates  per  county  in 
Florida: 

Zika  virus  quantification 

ZIKV  genome  equivalents  (GE)  were  quantified  by  qRT-PCR.  At  TSRI,  ZIKV  qRT-PCR  was  performed 
as  follows:  ZIKV  RNA  standards  were  transcribed  from  the  ZIKV  NS5  region  (8651-9498  nt)  using  the 
T7  forward  primer  (5’  -  TAA  TAC  GAC  TCA  CTA  TAG  GGA  GA  +  TCA  GGC  TCC  TGT  CAA  AAC 
CC  -  3’),  reverse  primer  (5’  -  AGT  GAC  AAC  TTG  TCC  GCT  CC  -  3’),  and  the  T7  Megascript  kit 
(Ambion).  For  qRT-PCR,  primers  and  a  probe  targeting  the  NS5  region  (9014-9123  nt)  were  designed 
using  the  ZIKV  isolate  PRVABC59  (GenBank:  KU501215):  forward  primer  (5’-  AGT  GCC  AGA  GCT 
GTG  TGT  AC  -  3’),  reverse  primer  (5’  -  TCT  AGC  CCC  TAG  CCA  CAT  GT  -  3’),  and  FAM-fluorescent 
probe  (5’  -  GGC  AGC  CGC  GCC  ATC  TGG  T  -  3’).  The  qRT-PCR  assays  were  performed  in  25  pi 
reactions  using  the  iScript  One-step  RT-PCR  Kit  for  probes  (Bio-Rad  Laboratories  Inc.)  and  2  pi  of 
sample  RNA.  Amplification  was  performed  at  50°C  for  20  min,  95  °C  for  3  min,  and  40  cycles  of  95  °C 
for  10  s  and  57°C  for  10  s.  Fluorescence  was  read  at  the  end  of  the  57°C  annealing-extension  step.  10-fold 
dilutions  of  the  ZIKV  RNA  transcripts  (2  pl/reaction)  were  used  to  create  a  standard  curve  for 
quantification  of  ZIKV  GE/pl  of  RNA. 

ZIKV  GE  were  quantified  at  USAMRIID  using  the  University  of  Bonn  ZIKV  envelope  protein  (Bonn  E) 
qRT-PCR  assay41.  RNA  standards  were  transcribed  using  an  amplicon  generated  from  a  ZIKV  plasmid 
containing  T7  promoter  at  the  start  of  the  5’UTR  region.  The  plasmid  was  designed  using  the  ZIKV 
isolate  Beh8 19015  (GenBank:  KU365778)  and  the  amplicon  included  nts  1-4348,  which  covers  the 
5’UTR,  C,  prM,  M,  E,  NS1,  and  NS2  regions.  The  qRT-PCR  assays  were  performed  in  25  pi  reactions 
using  the  Superscript  III  platinum  One-step  qRT-PCR  Kit  (ThermoFisher)  and  2  pi  of  sample  RNA  was 
used.  Amplification  was  performed  following  conditions  as  previously  described41.  10-fold  dilutions  of 
the  ZIKV  RNA  transcripts  (5  pl/reaction)  were  used  to  create  a  standard  curve  for  quantification  of  ZIKV 
GE/pl  of  RNA. 

Amplicon-based  Zika  virus  sequencing 

ZIKV  sequencing  at  TSRI  was  performed  using  an  amplicon-based  approach,  as  described2".  Briefly, 
cDNA  was  reverse  transcribed  from  5  pi  of  RNA  using  Superscript  IV  (Invitrogen).  ZIKV  cDNA  (2.5 
pl/reaction)  was  amplified  in  35  x  400  bp  fragments  from  two  multiplexed  PCR  reactions  using  Q5  DNA 
High-fidelity  Polymerase  (New  England  Biolabs).  The  amplified  ZIKV  cDNA  fragments  (50  ng)  were 
prepared  for  sequencing  using  the  Kapa  Hyper  prep  kit  (Kapa  Biosystems)  and  SureSelect  XT2  indexes 
(Agilent).  Agencourt  AMPure  XP  beads  (Beckman  Coulter)  were  used  for  all  purification  steps.  Paired- 
end  251  nt  reads  were  generated  on  the  MiSeq  using  the  V2  500  cycle  or  V3  600  cycle  kit  (Illumina). 

Trimmomatic  was  used  to  remove  primer  sequences  (first  22  nt  from  the  5’  end  of  the  reads)  and  bases  at 
both  ends  with  Phred  quality  score  <2042.  The  reads  were  then  aligned  to  the  complete  genome  of  a  ZIKV 
isolate  from  the  Dominican  Republic,  2016  (GenBank:  KU853012)  using  Novoalign  v3.04.04 
(www.novocraft.com).  Samtools  was  used  to  sort  the  aligned  BAM  files  and  to  generate  alignment 
statistics43.  Snakemake  was  used  as  the  workflow  management  system44.  The  code  and  reference  indexes 
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for  the  pipeline  can  be  found  at  https://github.com/andersen-lab/zika-pipeline.  ZIKV-aligned  reads  were 
visually  inspected  using  Geneious  v9.1.54:>  before  generating  consensus  sequences.  A  minimum  of  3x 
read-depth  coverage,  in  support  of  the  consensus,  was  required  to  make  a  call. 

The  consensus  ZIKV  sequences  from  FL01M  and  FL03M  generated  by  sequencing  35  x  400  bp 
amplicons  on  the  MiSeq  were  validated  using  the  following  approaches:  1)  sequencing  the  35  x  400  bp 
amplicons  on  the  Ion  S5  platform,  2)  sequencing  amplicons  generated  using  an  Ion  AmpliSeq  panel 
customly  targeted  towards  ZIKV  on  the  Ion  S5  platform,  and  3)  sequencing  5  x  2,150-2,400  bp  ZIKV 
amplicons  on  the  MiSeq.  For  Ion  library  preparation,  cDNA  was  synthesized  using  the  Superscript  VILO 
kit  (ThermoFisher).  ThermoFisher  designed  875  custom  ZIKV  primers  to  produce  75  amplicons  of  -200 
bp  in  two  PCR  reactions  for  use  with  their  Ion  AmpliSeq  Library  Kit  2.0.  The  reagent  FuPa  was  used  to 
digest  the  modified  primer  sequences  after  amplification.  The  DNA  templates  were  loaded  onto  Ion  520 
chips  using  the  Ion  Chef  and  sequenced  one  the  Ion  S5  with  the  200  bp  output  (ThermoFisher).  The  35  x 
400  bp  amplicons  generated  for  the  MiSeq  as  described  above  were  introduced  into  the  Ion  workflow 
using  the  Ion  AmpliSeq  Library  Kit  2.0,  but  without  fragmentation.  Primers  to  amplify  2,150-2,400  bp 
ZIKV  fragments  were  kindly  provided  by  Dawn  Dudly,  Shelby  O’Connor,  and  Dane  Gellerup  (AIDS 
Vaccine  Research  Laboratory,  University  of  Wisconsin,  Madison).  Each  fragment  was  amplified 
individually  by  PCR  using  the  cDNA  generated  above,  Q5  DNA  High-fidelity  Polymerase,  and  the 
following  thermocycle  conditions:  55 
and  68 

Agencourt  AMPure  XP  beads,  sheared  to  300  to  400  nt  fragments  using  the  Covaris  S2  sonicator,  indexed 
and  prepared  for  sequencing  as  described  above,  and  sequenced  using  the  MiSeq  V2  500  cycle  kit 
(paired-end  251  nt  reads).  Compared  to  the  consensus  sequences  generated  using  35  x  400  bp  amplicons 
on  the  MiSeq,  there  were  no  consensus-level  mismatches  in  the  CDS  using  any  of  the  other  three 
approaches  (Extended  Data  Table  2).  There  were,  however,  some  mismatches  in  the  5’  and  3’ 
untranslated  regions  (where  the  genomic  RNA  is  heavily  structured),  likely  a  result  of  PCR  bias  and 
decreased  coverage  depth. 

Enrichment-based  Zika  virus  sequencing 

ZIKV  sequencing  at  USAMRIID  was  performed  using  a  targeted  enrichment  approach.  Sequencing 
libraries  were  prepared  using  the  TruSeq  RNA  Access  Library  Prep  kit  (Illumina)  with  custom  ZIKV 
probes.  Our  probe  set  included  866  unique  probes  each  of  which  was  80  nt  in  length  (Supplementary 
Table  3).  The  probes  were  designed  to  cover  the  entire  ZIKV  genome  and  to  encompass  the  genetic 
diversity  present  on  GenBank  on  01/14/2016.  In  total,  26  ZIKV  sequences  were  used  during  probe  design 
(Supplementary  Table  3).  Extracted  RNA  was  fragmented  at  94°C  for  0-60  seconds  and  each  sample  was 
enriched  separately  using  a  quarter  of  the  reagents  specified  in  the  manufacturer’s  protocol.  Samples  were 
barcoded,  pooled  and  sequenced  using  the  MiSeq  Reagent  kit  v3  (Illumina)  on  an  Illumina  MiSeq  with  a 
minimum  of  2  x  151  bp  reads.  Dual  indexing,  with  no  overlapping  indices,  was  used. 

The  random  hexamer  associated  with  read  one  and  the  Illumina  adaptors  were  removed  from  the 
sequencing  reads  using  Cutadapt  vl.9.devl46,  and  low-quality  reads/bases  were  filtered  using  Prinseq-lite 
vO.20.347.  Reads  were  aligned  to  a  reference  genome  (KX 197 192.1)  using  Bowtie2  v2.0.648,  duplicates 
were  removed  with  Picard  (http://broadinstitute.github.io/picard),  and  a  new  consensus  was  generated 
using  a  combination  of  Samtools  v0.1.1843  and  custom  scripts.  Only  bases  with  Phred 
quality  score  >20  were  utilized  in  consensus  calling,  and  a  minimum  of  3X 
read-depth  coverage,  in  support  of  the  consensus,  was  required  to  make  a 
call;  positions  lacking  this  depth  of  coverage  were  treated  as  missing  [i.e. 
called  as  “N”). 
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Phylogenetic  analyses 

All  published  and  available  complete  ZIKV  genomes  of  the  Asian  genotype  from  the  Pacific  and  the 
Americas  were  retrieved  from  GenBank  public  database  as  of  December  2016.  Sequences  (n=65)  were 
aligned  together  with  ZIKV  genomes  generated  in  this  study  (n=39)  using  MAFFT49  and  manual  curation. 
The  multiple  alignment  contained  104  ZIKV  sequences  collected  between  2013  and  2016,  from  the 
Pacific  (American  Samoa,  French  Polynesia,  and  Tonga),  Brazil,  other  South  and  Central  Americas 
(Guatemala,  Mexico,  Suriname,  and  Venezuela),  the  Caribbean  (Dominican  Republic,  Guadeloupe,  Haiti, 
Martinique,  and  Puerto  Rico),  and  the  United  States. 

In  order  to  determine  the  temporal  signal  of  the  sequence  dataset,  a  maximum  likelihood  phylogeny  was 
first  reconstructed  with  PhyML50  using  the  Hasegawa-Kishino-Yano  (HKY)  nucleotide  substitution 
model51  and  gamma  distributed  rates  amongst  sites52,  which  was  identified  as  the  best  fitting  model  for 
ML  inference  by  jModelTest253.  Then,  a  correlation  between  root-to-tip  genetic  divergence  and  date  of 
sampling  was  conducted  in  TempEst54. 

Bayesian  phylogenetic  analyses  were  performed  using  BEAST  v.1.8.421  to  infer  time-structured 
phylogenies.  We  used  a  SDR06  nucleotide  substitution  model5'’  with  a  non-informative  continuous  time 
Markov  chain  reference  prior56  on  the  molecular  clock  rate.  Replicate  analyses  using  multiple 
combinations  of  molecular  clock  and  coalescent  demographic  models  were  explored  to  select  the  best 
fitting  model  by  marginal  likelihood  comparison  using  path-sampling  and  stepping-stone  estimation 
approaches57-59  (Extended  Data  Table  3).  The  best  fit  model  was  a  strict  molecular  clock  along  with  a 
Bayesian  skyline  demographic  model60.  All  the  Bayesian  analyses  were  run  for  30  million  Markov  chain 
Monte  Carlo  steps,  sampling  parameters  and  trees  every  3000  generations. 

An  initial  phylogeny,  rooted  on  lineages  previously  sampled  in  French  Polynesia  (Extended  Data  Fig.  2a) 
resulted  in  a  strong  correlation  between  ZIKV  genetic  divergence  and  sampling  time  (Extended  Data  Fig. 
2b). 

Expected  number  and  distribution  of  local  cases  from  Zika  virus  importations 

We  used  branching  process  theory61'62  to  generate  the  offspring  distribution  (subsequent  local  cases)  that 
is  expected  from  a  single  introduction.  The  offspring  distribution  is  modelled  with  a  negative  binomial 
distribution  with  mean  R0  and  over-dispersion  parameter  k.  The  total  number  of  cases,  L,  that  is  caused  by 
a  single  importation  (including  the  index  case)  after  an  infinite  time6'1  has  the  following  form: 


L  = 


 T(fr  7+/-1) 


(*r 


r(fcy)r(y+1)  j1+^j w-1 


The  parameter  k  represents  the  variation  in  the  number  of  secondary  cases  generated  by  each  case  of 
ZIKV61.  In  the  case  of  vector  borne  diseases,  local  heterogeneity  is  high  due  to  a  variety  of  factors  such  as 
mosquito  population  abundance,  human  to  mosquito  interaction,  and  control  interventions64'65.  Here,  we 
assume  high  heterogeneity  (k=0.1)  following  previous  estimates  for  vector  borne  diseases62,  though  we 
conducted  sensitivity  analyses  with  more  homogeneous  transmission  (k=l).  This  distribution  L  is  plotted 
in  Extended  Data  Fig.  4a.  Throughout  the  following,  we  take  a  forward  simulation  approach,  drawing 
random  samples  from  this  distribution.  All  estimates  are  based  on  100,000  random  simulations. 

We  use  this  formula  to  estimate  the  probability  of  observing  244  local  cases  in  Miami-Dade  county 
alongside  322  travel-related  cases.  We  approach  this  by  sampling  322  introduction  events  from  L  and 
calculating  the  total  number  of  local  cases  in  the  resulting  outbreak  (Extended  Data  Fig.  4b).  We  also 
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calculate  the  probability  of  observing  244  or  fewer  local  cases  in  the  total  outbreak  (Extended  Data  Fig. 
4c),  finding  that  R0  must  be  0.6  or  smaller  to  have  at  least  a  1%  chance  of  observing  244  or  fewer  local 
cases.  As  a  sensitivity  analysis,  we  additionally  model  introductions  with  the  assumption  that  only  50%  of 
travelers  are  infectious  at  time  of  arrival  into  Miami-Dade  County. 

We  further  use  this  formula  to  address  the  probability  of  observing  3  distinct  genetic  clusters  (FI,  F2  and 
F3)  representing  3  introduction  events  in  a  sample  of  27  ZIKV  genomes  from  Miami-Dade  county.  We 
approach  this  by  sampling  introduction  events  until  we  accumulate  244  local  cases  according  to  L, 
arriving  at  N  introduction  events  with  case  counts  (ju  j2,  ...  /v).  We  then  sample  27  cases  without 
replacement  from  (j\,j2,  ...  jA)  following  a  hypergeometric  distribution  and  record  the  number  of  distinct 
clusters  drawn  in  the  sample.  We  find  that  higher  values  of  RQ  result  in  fewer  distinct  clusters  within  the 
sample  of  27  genomes  (Extended  Data  Fig.  4d).  We  additionally  calculate  the  probability  of  sampling  3  or 
fewer  distinct  genetic  clusters  in  27  genomes  (Extended  Data  Fig.  4e),  finding  that  R0  must  be  greater 
than  0.6  to  have  at  least  a  1%  chance  of  observing  3  or  fewer  distinct  genetic  clusters.  Additionally,  as  a 
sensitivity  analysis  we  model  a  preferential  sampling  process  in  which  larger  clusters  are  more  likely  to 
be  drawn  from  than  smaller  clusters.  Here,  we  use  a  parameter  a  that  enriches  the  hypergeometric 
distribution  following  (/ ] ",  j2,  ...jNa). 

Under  strict  assumptions  (100%  of  travelers  infectious  and  random  sampling),  we  find  R0  of 
approximately  0.6  to  be  most  consistent  with  both  data  sources.  Under  relaxed  assumptions  (50%  of 
travelers  infectious  and  preferential  sampling),  we  find  R0  of  0.4-0.9  are  loosely  compatible  with  the  data 
(Extended  Data  Fig.  4). 

We  additionally  perform  birth-death  stochastic  simulations  assuming  a  serial  interval  with  mean  20 
days12.  We  record  the  number  of  stochastic  simulations  still  persisting  after  a  particular  number  of  days 
for  different  values  of  R0  (Fig.  2c). 

Incidence  and  attack  rates 

Weekly  suspected  and  confirmed  ZIKV  case  counts  from  countries  and  territories  within  the  Americas 
with  local  transmission  (January  1st  to  September  18th,  2016)  were  obtained  from  the  Pan  American 
Health  Organization  (PAHO)66.  In  most  cases,  the  weekly  case  numbers  per  country  were  only  reported  in 
bar  graphs,  therefore  we  used  WebPlotDigitizer  v3.10  (http://arohatgi.info/WebPlotDigitizer)  to  estimate 
the  numbers.  Country  and  territory  total  population  sizes  to  calculate  weekly  and  monthly  ZIKV 
incidence  rates  were  also  obtained  from  PAHO67.  Incidence  rates  calculated  from  countries  and  territories 
in  the  Americas  during  January  to  June,  2016  (based  on  the  earliest  introduction  time  estimates  until  the 
first  known  cases)  were  used  an  estimate  for  infection  likelihood  to  investigate  sources  of  ZIKV 
introductions. 

To  remove  any  potential  reporting  biases  with  incidence  rates,  we  also  used  ZIKV  attack  rates  (/.<?., 
proportion  infected  before  epidemic  burnout)  to  estimate  infection  likelihood.  Attack  rates  were 
calculated  using  a  susceptible-infected-recovered  (SIR)  transmission  model  derived  from  seroprevalence 
studies  and  environmental  factors  as  described611.  Using  attack  rates  as  an  estimate  of  infection  likelihood, 
we  predict  that  -60%  of  the  infected  travelers  entering  Miami  came  from  the  Caribbean  (Extended  Data 
7b),  which  is  in  agreement  with  our  methods  using  incidence  rates  (Fig.  3c).  A  list  of  countries  and 
territories  used  in  these  analysis  can  be  found  in  Supplementary  Table  2. 

Airline  and  cruise  ship  traffic 

To  investigate  whether  the  transmission  of  ZIKV  in  Florida  coincides  with  travel  patterns  from  ZIKV 
endemic  regions,  we  obtained  the  number  of  passengers  arriving  at  airports  in  Florida  via  commercial  air 
travel.  We  collated  flight  data  from  countries  and  territories  in  the  Americas  with  local  ZIKV 
transmission  between  January  and  June,  2016  (based  on  the  earliest  introduction  time  estimates  until  the 
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first  known  cases,  Supplementary  Table  2),  arriving  at  all  commercial  aiiports  in  Florida.  The  data  were 
obtained  from  the  International  Air  Transportation  Association,  which  collects  data  on  an  estimated  90% 
of  all  passenger  trips  worldwide.  We  used  previously  reported  flight  data  from  October  2014  through 
September  2015  to  compare  traffic  patterns  across  major  United  States  airports23. 

Schedules  for  cruise  ships  visiting  Miami,  Port  Canaveral,  Port  Everglades,  Fort  Lauderdale,  Key  West, 
Jacksonville  (all  in  Florida),  Houston,  Galveston  (both  in  Texas),  Charleston  (South  Carolina)  and  New 
Orleans  (Louisiana)  ports  in  the  year  2016  were  collated  from  www.cruisett.com  and  confirmed  by  cross- 
referencing  ship  logs  reported  by  Port  of  Miami  and  reported  ship  schedules.  Cruise  ship  capacities  were 
extracted  from  www.cruisemapper.com.  Every  country/territory  with  ZIKV  transmission  visited  by  a 
cruise  ship  10  days  (the  approximate  mean  time  to  ZIKV  clearance  in  human  blood  \i.e.,  the  infectious 
period])69  prior  to  arrival  to  Port  of  Miami  was  counted  as  contributing  the  ship’s  capacity  worth  of 
passengers  to  Miami  to  the  month  of  arrival  (Supplementary  Table  2). 

Expected  number  of  travelers  infected  with  Zika  virus 

We  estimated  the  expected  number  of  travelers  entering  Miami  who  were  infected  with  ZIKV  (A.)  by 
using  the  total  travel  capacity  (C)  and  the  likelihood  of  ZIKV  infection  (infections  (I)  per  person  (AO) 
from  each  country/territory  (/): 

We  summed  the  number  of  expected  infected  travelers  from  each  country/territory  with  ZIKV 
transmission  by  region  and  travel  method  (flights  or  cruises).  We  reported  the  data  as  the  relative 
proportion  of  infected  travelers  from  each  region  (Fig.  3c,  Extended  Data  Fig.  7a)  and  as  the  absolute 
number  of  infected  travelers  (Fig.  3d,  Extended  Data  Fig.  7b,  Supplementary  Table  2). 
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Figure  Legends 


Month  0(2016 

Figure  1  I  Zika  virus  outbreak  in  Florida,  (a)  Weekly  counts  of  confirmed  travel-associated  and  locally 
acquired  ZIKV  cases  in  2016.  (b)  Four  counties  reported  probable  locally-acquired  ZIKV  cases  in  2016: 
Miami-Dade  (243),  Broward  (1),  Palm  Beach  (5),  Pinellas  (1),  and  undetermined  (6).  Sequence  data 
(seqs)  were  generated  from  the  only  case  reported  in  Pinellas  County  and  from  28  human  cases  and  7  Ae. 
aegypti  samples  from  Miami-Dade  County,  (c)  The  locations  of  mosquito  traps  and  collected  Ae.  aegypti 
mosquitoes  found  to  contain  ZIKV  RNA  (ZIKV+)  in  relation  to  the  transmission  zones  within  Miami. 
Additional  trap  sites  outside  of  this  area  are  not  shown,  (d)  Temporal  distribution  of  weekly  ZIKV  cases 
(top,  left  y-axis),  sequenced  cases  (bottom),  and  Ae.  aegypti  abundance  per  trap  night  (right  y-axis) 
associated  with  the  three  described  transmission  zones.  ZIKV  cases  and  sequences  are  plotted  in  relation 
to  onset  dates.  Sequenced  cases  without  onset  dates  are  not  shown.  Human  cases  and  Ae.  aegypti 
abundance  per  week  were  positively  correlated  (Spearman  r  =  0.61,  Extended  Data  Fig.  le). 
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Figure  2  I  Multiple  introductions  of  Zika  virus  into  Florida,  (a)  Maximum  clade  credibility  (MCC) 
tree  of  ZIKV  genomes  sequenced  from  outbreaks  in  the  Pacific  islands  and  the  epidemic  in  the  Americas. 
Tips  are  colored  based  on  collection  location.  The  five  tips  outlined  in  blue  but  filled  with  a  different 
color  indicates  ZIKV  cases  in  the  United  States  associated  with  travel  (fill  color  indicates  the  probable 
origin).  Clade  posterior  probabilities  are  indicated  with  black  circles  inside  white  circles  with  posterior 
probability  of  1  filling  the  entire  white  circle.  The  grey  violin  plot  indicates  the  95%  highest  posterior 
density  (HPD)  interval  for  tMRCA  for  the  epidemic  in  the  Americas  (AM)5.  Lineage  F4  contains  two 
identical  ZIKV  genomes  from  the  same  patient,  (b)  A  zoomed  in  version  of  the  whole  MCC  tree  showing 
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the  collection  locations  of  Miami-Dade  sequences  and  whether  they  were  sequenced  from  mosquitoes 
(numbers  correspond  to  Fig.  lc).  95%  HPD  intervals  are  shown  for  the  tMRCAs  (c)  The  probability  of 
ZIKV  persistence  after  introduction  for  different  R0.  Persistence  is  measured  as  the  number  of  days  from 
initial  introduction  of  viral  lineages  until  their  extinction.  Vertical  dashed  lines  show  the  inferred 
persistence  time  for  lineages  FI,  F2  and  B  based  on  their  tMRCA.  (d)  Total  number  of  introductions 
(mean  with  95%  Cl)  that  contributed  to  the  outbreak  of  244  local  cases  in  Miami-Dade  County  for 
different  A\,. 
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Figure  3  I  High  probability  of  Zika  virus  introductions  into  Miami  from  the  Caribbean,  (a)  Reported 
ZIKV  cases  from  January  to  June,  2016  normalized  by  total  population,  (b)  The  number  of  expected 
travelers  entering  Miami  during  January  to  June,  2016  by  method  of  travel,  (c)  The  number  of  travelers 
and  the  reported  ZIKV  incidence  rate  from  the  country/territory  visited  were  used  to  estimate  the 
proportion  of  infected  travelers  coming  from  each  region  with  ZIKV  in  the  Americas,  (d)  The  observed 
number  of  weekly  travel-associated  ZIKV  cases  in  Florida  were  plotted  with  the  expected  number  of 
ZIKV-infected  travelers  (as  estimated  in  panel  c)  coming  from  all  of  the  Americas  (grey  line)  and  the 
regional  contributions  (colored  areas),  (e)  The  countries  visited  by  236  travel-associated  ZIKV  cases 
reported  from  Miami-Dade  county,  (f)  The  number  of  travelers  per  month  (circles)  entering  Florida  cities 
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via  flights  and  cruises  were  plotted  with  relative  Ae.  ciegypti  abundance.  Only  ports  receiving  >10,000 
passengers  per  month  are  shown. 


UNCLASSIFIED 


TR-1 7-042  DISTRIBUTION  STATEMENT  A:  Approved  for  public  release;  distribution  is  unlimited. 


Extended  Data 
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Extended  Data  Fig.  1  I  Miami- Dade  mosquito  surveillance  and  relative  Aedes  aegypti  abundance,  (a) 

Mosquito  surveillance  data  reported  from  June  21  to  November  28,  2016  was  used  to  evaluate  the  risk  of 
ZIKV  infection  from  mosquito-borne  transmission  in  Miami.  A  total  of  24,306  Ae.  aegypti  and  45  Ae. 
albopictus  (none  from  pools  containing  ZIKV  RNA,  data  not  shown  in  table)  were  collected.  Trap  nights 
are  the  total  number  times  each  trap  site  was  used  and  the  trap  locations  are  shown  in  Fig.  Id  (some 
“Other  Miami”  trap  sites  are  not  shown).  Up  to  50  mosquitoes  of  the  same  species  and  trap  night  were 
pooled  together  for  ZIKV  RNA  testing.  The  infection  rates  were  calculated  using  a  maximum  likelihood 
estimate  (MLE).  Miami  Beach  data  includes  both  North  and  South  Miami  Beach,  (b)  Ae.  aegypti 
abundance  per  trap  night  for  each  transmission  zone  was  calculated  using  a  sliding  window  of  3  days 
with  a  1  day  step  to  help  normalize  some  of  the  natural  variation  in  mosquito  abundance  data.  The 
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minimum  temperatures  in  Miami  were  plotted  to  demonstrate  that,  in  general,  mosquito  abundance 
decreases  with  temperature.  The  influence  of  Hurricane  Matthew,  represented  by  the  hurricane  symbol  on 
October  7,  is  difficult  to  determine  because  mosquito  abundance  was  already  low.  (c)  Insecticide  usage, 
including  truck  and  aerial  adulticides  and  larvacides,  by  the  Miami-Dade  Mosquito  Control  in  Wynwood 
(left)  and  Miami  Beach  (right)  was  overlaid  with  Ae.  aegypti  abundance  per  trap  night  to  demonstrate  that 
intense  usage  of  insecticides  may  have  helped  to  reduce  local  mosquito  populations,  (d)  Vector  index,  a 
relative  measure  of  risk  of  mosquito-borne  transmission,  is  the  product  of  mosquito  abundance  and  the 
MLE  infection  rate.  Data  is  shown  as  the  mean  with  95%  Cl  using  a  sliding  window  of  3  days  with  a  1 
day  step.  When  infected  mosquitoes  are  present,  vector  index  is  heavily  influenced  by  mosquito 
abundance,  (e)  If  mosquito  abundance  influences  vector  index,  and  vector  index  is  an  estimate  of  human 
risk37,  then  a  correlation  between  Ae.  aegypti  abundance  and  human  ZIKV  cases  should  be  expected.  The 
number  of  weekly  ZIKV  cases  (based  on  symptoms  onset)  was  correlated  with  mean  Ae.  aegypti 
abundance  per  trap  night  determined  from  the  same  week  and  zone  (Fig.  le,  Spearman  r  =  0.61).  (f) 
Relative  Ae.  aegypti  abundance  for  each  Florida  county  and  month  was  estimated  using  a  multivariate 
regression  model,  demonstrating  spatial  and  temporal  heterogeneity  for  the  risk  of  ZIKV  infection. 
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Extended  Data  Fig.  2  I  Maximum  likelihood  tree  (a)  and  root-to-tip  regression  (b)  of  Zika  virus 
genomes  from  Pacific  islands  and  the  epidemic  in  Americas,  (a)  Tips  are  coloured  by  location,  labels 
in  bold  indicate  samples  sequenced  in  this  study,  Florida  clusters  FI -4  are  indicated  by  vertical  lines  to 
the  right  of  the  tree,  (b)  Linear  regression  of  tip  dates  against  divergence  from  root  based  on  sequences 
with  known  collection  dates  indicates  that  the  evolutionary  rate  estimate  for  the  entire  ZIKV  phytogeny  is 
1.10x10  3  nucleotide  substitutions/site/year  (subs/site/yr).  This  is  consistent  with  BEAST  analyses  using  a 
strict  molecular  clock  and  a  skyline  tree  prior,  the  best-performing  combination  of  clock  and  demographic 
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model  according  to  marginal  likelihood  estimates,  indicating  that  the  evolutionary  rate  is  1.08x10  3  (95% 
highest  posterior  density:  0.97  -  1.19x10  3)  subs/site/yr.  These  values  are  in  agreement  with  previous 
estimates  calculated  based  on  ZIKV  genomes  from  Brazil5. 
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Extended  Data  Fig.  3  I  Molecular  clock  dating  of  Zika  virus  clades  sequenced  from  Florida. 

Maximum  clade  credibility  (MCC)  tree  of  ZIKV  genomes  collected  from  Pacific  islands  and  the  epidemic 
in  Americas.  Circles  at  the  tips  are  colored  based  on  origin  location.  Clade  posterior  probabilities  are 
indicated  with  black  circles  inside  white  circles  with  posterior  probability  of  1  filling  the  entire  white 
circle.  The  grey  violin  plot  indicates  the  95%  highest  posterior  density  (HPD)  interval  for  the  tMRCA  of 
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the  American  epidemic.  We  estimated  that  the  tMRCA  for  the  ongoing  epidemic  in  the  Americas 
occurred  during  October,  2013  (node  AM,  Extended  Table  1,  95%  BCI:  August,  2013-January,  2014), 
which  is  consistent  with  previous  analysis  based  on  ZIKV  genomes  from  Brazil5  (node  AM,  Fig.  2a). 
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Extended  Data  Fig.  4  I  Estimation  of  basic  reproductive  number  and  number  of  introductions,  (a) 

Probability  distribution  of  total  number  of  cases  caused  by  a  single  introduction  (excluding  the  index 
case)  for  different  values  of  R0.  (b)  Mean  and  95%  uncertainty  interval  for  total  number  of  local  cases 
caused  by  322  introduction  events  for  different  values  of  R0.  (c)  Probability  of  observing  244  of  fewer 
local  cases  with  322  introduction  events  for  different  values  of  R0.  (d)  Mean  and  95%  uncertainty  interval 
for  total  number  of  distinct  clusters  observed  in  22  sequenced  cases  for  different  values  of  R0.  (e) 
Probability  of  observing  3  or  fewer  clusters  in  22  sequenced  cases  for  different  values  of  R0. 


Extended  Data  Fig.  5  I  Weekly  reported  Zika  virus  case  numbers  and  incidence  rates  in  the 
Americas,  (a)  ZIKV  cases  (suspected  and  confirmed)  and  (b)  incidence  rates  (normalized  per  100,000 
population)  are  shown  for  each  country  or  territory  with  available  data  per  epidemiological  week  from 
January  1st  to  September  18th,  2016.  (c)  Each  country  or  territory  with  available  data  is  colored  by  its 
reported  ZIKV  incidence  rate  from  January  to  June,  2016  (the  time  frame  for  analysis  of  ZIKV 
introductions  into  Florida). 
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Extended  Data  Fig.  6  I  Cruise  and  flight  traffic  entering  Miami  from  regions  with  Zika  virus 
transmission.  The  number  of  passengers  entering  Miami,  by  either  (a)  cruises  or  (b)  flights,  from  each 
country  or  territory  in  the  Americas  with  ZIKV  transmission  per  month  (left  panel).  The  center  map  and 
inset  show  the  cumulative  numbers  of  travelers  entering  Miami  during  January  to  June,  2016  (the  time 
frame  for  analysis  of  ZIKV  introductions  into  Florida)  from  each  country  or  territory  per  method  of 
travel,  (c)  The  total  traffic  (i.e.  cruises  and  flights)  is  shown  entering  Miami  per  month,  (d)  Summarizes 
the  contribution  of  each  travel  method  towards  the  cumulative  numbers  of  travelers  entering  Miami  from 
the  Caribbean,  South  America,  and  Central  America  during  January  to  June,  2016. 
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Extended  Data  Fig.  7  I  Expected  number  of  Zika  virus  infected  travelers  from  the  Caribbean  is 
correlated  with  the  total  observed  number  of  travel  infections,  (a)  The  ZIKV  incidence  rates  used  in 
the  study  may  have  been  biased  by  reporting  accuracy.  Therefore,  we  alternately  used  projected  ZIKV 
attack  rates68  (i.e.  predicted  proportion  of  population  infected  before  epidemic  burnout)  along  with  travel 
capacities  to  estimate  the  proportion  of  infected  travelers  entering  Miami  from  each  region  with  ZIKV  in 
the  Americas.  About  60%  of  the  infected  travelers  are  expected  to  have  arrived  from  the  Caribbean, 
similar  to  our  results  using  incidence  rates  (Fig.  3c).  (b)  The  expected  number  of  travel-associated  ZIKV 
cases  were  estimated  by  the  number  of  travelers  coming  into  Miami  from  each  country/territory  (travel 
capacity)  and  the  in-country/territory  infection  likelihood  (incidence  rate  per  person)  per  week.  The 
expected  travel  cases  were  summed  from  all  of  the  Americas  (left),  Caribbean  (left  center),  South 
America  (right  center),  and  Central  America  (right)  and  plotted  with  the  observed  travel-associated  ZIKV 
cases.  Numbers  in  each  plot  indicate  Spearman  correlation  coefficients. 
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Extended  Data  Fig.  8  I  High  potential  for  Zika  virus  introductions  into  Miami  via  cruises,  (a)  The 

monthly  average  (95%  Cl)  number  of  expected  travelers  (i.e.  capacity)  entering  all  Florida  ports  from 
countries/territories  in  the  Americas  with  ZIKV  transmission  during  January  to  June,  2016.  (b)  The 
monthly  average  (95%  Cl)  number  of  travelers  entering  the  major  United  States  ports  via  cruises  from 
countries/territories  in  the  Americas  with  ZIKV  transmission  during  January  to  June,  2016.  Ports  were 
combined  within  states,  (c)  The  monthly  cruise  capacity  for  each  major  United  States  port  (shown  as 
circle  diameter)  with  monthly  potential  Ae.  aegypti  abundance  (circle  color),  as  previously  estimated18. 
Cruise  capacities  from  Houston  and  Galveston,  TX  were  combined. 
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Extended  Data  Table  1  I  Time  of  the  most  recent  common  ancestor. 


Clade  AM  tM RCA  Clade  A  t M RCA  Clade  B  tM RCA 


Model  combination 

Mean 

Lower 

95%  BCI 

Upper 
95%  BCI 

Mean 

Lower 

95%  BCI 

Upper 
95%  BCI 

Mean 

Lower 

95%  BCI 

Upper 
95%  BCI 

Strict,  Constant 

2013.86 

2013.66 

2014.06 

2015.57 

2015.39 

2015.73 

2015.63 

2015.49 

2015.84 

Strict,  Exponential 

2013  88 

2013.66 

2014.06 

2015.58 

2015.40 

2015.72 

2015.68 

2015.52 

2015.84 

Strict,  Bayesian  SkyGrid 

2013.85 

2013.64 

2014.05 

2015.57 

2015.39 

2015.72 

2015.68 

2015.50 

2015.83 

Strict,  Bayesian  Skyline 

2013.90 

2013.69 

2014.08 

2015.54 

2015.38 

2015.73 

2015.67 

2015.47 

2015.83 

UCLN,  Constant 

2013.90 

2013.66 

2014.12 

2015.62 

2015.41 

2015.80 

2015.73 

2015.53 

2015.91 

UCLN,  Exponential 

2013.90 

2013.65 

2014.11 

2015.62 

2015.42 

2015.78 

2015.73 

2015.54 

2015.90 

UCLN,  Bayesian  SkyGrid 

2013.90 

2013.66 

2014.12 

2015.62 

2015.42 

2015.80 

2015.73 

2015.54 

2015.92 

UCLN,  Bayesian  Skyline 

2013.93 

2013.71 

2014.15 

2015.56 

2015.38 

2015.77 

2015.70 

2015.50 

2015.89 

Clade  FI  tMRCA 

Clade  F2  tMRCA 

Model  combination 

Mean 

Lower 

95%  BCI 

Upper 
95%  BCI 

Mean 

Lower 

95%  BCI 

Upper 
95%  BCI 

Strict,  Constant 

2016  14 

2016.00 

2016.26 

2016.15 

2015.97 

2016.29 

Strict,  Exponential 

2016.14 

2016.00 

2016.26 

2016.15 

2015.99 

2016.30 

Strict,  Bayesian  SkyGrid 

2016.14 

2016.00 

2016.26 

2016.15 

2015.97 

2016.29 

Strict,  Bayesian  Skyline 

2016.25 

2016.13 

2016.36 

2016.23 

2016.04 

2016.36 

UCLN,  Constant 

2016  16 

2016.03 

2016.29 

2016.17 

2015.99 

2016.32 

UCLN,  Exponential 

2016.16 

2016.02 

2016.28 

2016.17 

2016.00 

2016.31 

UCLN.  Bayesian  SkyGrid 

2016.16 

2016.01 

2016.28 

2016.18 

2016.00 

2016.32 

UCLN,  Bayesian  Skyline 

2016.26 

2016.13 

2016.37 

2016.24 

2016.04 

2016.38 

Extended  Data  Table  2  I  Validation  of  sequencing  results. 


Sample 

Amplicon  method 

NGS  platform 

Mismatches/nucleotides  covered 

3'  UTR  CDS  5'  UTR 

FL01M 

35  x  400  bp 

Ion  S5 

1/80 

0/10272 

7/252 

75  x  -200  bpb 

Ion  S5 

2/75 

0/10272 

4/205 

5  x  -2,200  bp 

MiSeq 

0/80 

0/10272 

0/32 

FL03M 

35  x  400  bp 

Ion  S5 

3/87 

0/10272 

20/252 

75  x  -200  bpb 

Ion  S5 

4/78 

0/10272 

5/198 

5  x  -2,200  bp 

MiSeq 

0/82 

0/10272 

0/32 

a  Compared  to  the  consensus  genomes  generated  by  sequencing  35  x  400  bp  amplicons  on  the  MiSeq. 
h  Amplicons  produced  using  Ion  AmpliSeq  and  875  custom  ZIKV  primers. 

NGS,  next-generation  sequencing;  UTR,  untranslated  region;  CDS,  coding  sequence. 


Extended  Data  Table  3  I  Model  selection  to  infer  time-structured  phylogenies. 
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Model  combination 

Path  Sampling 

Ranking 

Stepping  Stone 

Ranking 

Strict,  Constant 

-24305.105 

3 

-24307.152 

3 

Strict,  Exponential 

-24310.130 

4 

-24312.460 

4 

Strict,  Bayesian  SkyGrid 

-24301.305 

2 

-24303.142 

2 

Strict,  Bayesian  Skyline 

-24287.785 

1 

-24290.762 

1 

UCLN,  Constant 

-25467.164 

6 

-25470.141 

6 

UCLN,  Exponential 

-25468.457 

7 

-25471.066 

7 

UCLN,  Bayesian  SkyGrid 

-25471.792 

8 

-25474.032 

8 

UCLN,  Bayesian  Skyline 

-25447.942 

5 

-25450.574 

5 

Supplemental  Data 

Supplementary  Table  1.  Summary  of  the  Zika  virus  sequencing  data  produced  in  this  study. 

Supplementary  Table  2.  Epidemiological  data  and  travelers  entering  Miami,  Florida  from  January  to 
June,  2016. 

Supplementary  Table  3.  Probe  sequences  used  for  RNA  Access  targeted  enrichment  of  Zika  virus. 
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